Abstract

In this paper, we introduce new lower bounds on the distortion of scalar fixed-rate
codes for lossy compression with side information available at the receiver. These
bounds are derived by presenting the relevant random variables as a Markov chain and
applying generalized data processing inequalities à la Ziv and Zakai. We show that by
replacing the logarithmic function with other functions in the data processing theorem
we formulate, we obtain new lower bounds on the distortion of scalar coding with side
information at the decoder. These bounds are shown to be better than those obtained
from the Wyner-Ziv rate-distortion function.

Index Terms: side information, Wyner-Ziv problem, Ziv-Zakai bounds, source
coding, on-line schemes, scalar coding, Renyi entropy, rate-distortion theory.

1 Introduction



The Wyner-Ziv (WZ) problem has received considerable attention over the last three
decades. Several attempts have been made to develop practical schemes for lossy coding in the
WZ setting, by using codes with certain structures that facilitate the encoding and the 
decoding. Most notably, these studies include nested structures of linear coset codes (in the 
role of bins) for discrete sources, and nested lattice structures for continuous valued sources, 
see e.g., [2], [3]. Other directions of introducing structure into WZ coding are associated 
with trellis/turbo/LDPC designs ([I] and references therein) and with progressive coding, 
i.e., successive refinement with layered code design [5], [6]. The case of scalar source codes
for the WZ problem was also handled in several papers, e.g., [7] and [8]. Zero-delay coding
strategies for the WZ problem were introduced in [9], where structure theorems for
fixed-rate codes, under the assumption of a Markov source, were given. These results were later
extended in [10] to include variable-rate coding. In [11] and [12] it was conjectured that
under the high-resolution assumption, the optimal quantization level density is periodic. In 
addition, zero-delay schemes for specific source-side information correlation were presented 
in [11], [12] and [13]. Zero-delay coding of individual sequences under the conditions of the
WZ problem was considered in p3] , where existence of universal schemes for fixed-rate and 
variable-rate coding was established. 

In this paper, we develop lower bounds on the distortion in the scalar WZ setting. We
generalize the results of [15] and [16], concerning functionals satisfying a data-processing
theorem, to this setting. In [15] it was shown that the rate-distortion (RD) bound ($R(D) \le C$)
remains true when the negative logarithm function, in the definition of mutual information,
is replaced by an arbitrary convex, non-increasing function satisfying some technical
conditions. For certain choices of this convex function, the bounds obtained were better
than the classical RD bounds. These results were substantially generalized in [16] to apply
to even more general information measures. The methods of [15] were also used in [17],
[18] and [19]. In these papers, lower bounds on the distortion of delay-constrained joint
source-channel coding were given. These bounds were obtained by combining the Renyi
information measure [20] with the generalized data processing theorem of [15], under
high-resolution and high-SNR approximations. Another related work is [21], where certain
degrees of freedom of the Ziv-Zakai generalized mutual information were further exploited
in order to get better bounds.

We start by presenting the relevant random variables of the WZ problem as a Markov 
chain. Then, using a data processing theorem, we obtain lower bounds on the distortion. 
We show that replacing the logarithmic function by other functions may give better bounds
on the distortion of delay-limited coding (in particular, for scalar coding) in the WZ setting.
Examples of non-trivial lower bounds for scalar coding, in this setting, are obtained using
the convex function $Q(t) = t^{1-\alpha}$, $\alpha > 1$, which is equivalent to using the Renyi information
measure. The importance of such bounds stems from the fact that finding the optimal scalar
code in the WZ setting is, in general, a hard problem. In fact, it is a problem of finding an 
optimal partition of the source alphabet and this partition does not necessarily correspond 
to intervals. A main objective will be to use these bounds for studying the performance of 
concrete coding schemes. 

The remainder of the paper is organized as follows. In Section 2, we present our formu- 
lation of the WZ problem and establish a generalized data processing theorem (DPT) for 
this setting. In Subsection 2.1, we define the fixed-rate scalar coding case. We then give 
an upper bound on the generalized capacity, which is one component of the above DPT. 
In Subsection 2.2, we handle the second component of the generalized DPT, i.e., the gen- 
eralized RD function. We start with a general characterization of this function. Then, we 
introduce a closed-form expression of the generalized RD function for uniformly distributed 
sources w.r.t. general symmetric distortion measures. In Section 3, we use the results of 
Section 2 to obtain non-trivial lower bounds on the distortion of scalar coding in the WZ 
setting in several cases. Finally, we demonstrate that for large alphabets, non-trivial bounds 
can be derived for various channels and as a result, the performance range for scalar coding 
can be given. 

2 Problem Formulation and Results 

In this section, we present the relevant random variables of the WZ problem as a Markov 
chain and establish a generalized data processing theorem (DPT) for this setting, using the 
method of [15].

We begin with notation conventions. Capital letters represent scalar random variables,
specific realizations of them are denoted by the corresponding lowercase letters, and their
alphabets by calligraphic letters. The inner product of two vectors $\vec{a}$ and $\vec{b}$ will be
denoted by $\vec{a} \cdot \vec{b}$. Logarithms are defined to the base 2.

We consider a memoryless source producing a random sequence $X_1, X_2, \ldots$, $X_i \in \mathcal{X}$,
$i = 1, 2, \ldots$, where $\mathcal{X}$ is a finite alphabet with cardinality $K$. Without loss of generality, we
define this alphabet to be the set $\{1, 2, \ldots, K\}$. The probability mass function of $X$, $p(x)$, is
known. A fixed-rate scalar source code with rate $R = \log M$ (see footnote 1) partitions $\mathcal{X}$ into $M$ disjoint
subsets $(A_1, A_2, \ldots, A_M)$, $M \le K$. The encoder maps $X_i$ into a channel symbol $Z_i$, using
a function $f : \mathcal{X} \to \{1, 2, \ldots, M\}$, that is, $Z_i = f(X_i)$. The decoder, in addition to $Z_i$, has
access to a random variable $Y_i$, which depends on $X_i$ via a known discrete memoryless
channel (DMC), defined by the single-letter transition probability matrix $\{p(y|x)\}$, whose
entries are the conditional probabilities of the different channel output symbols given the
channel input symbols. Based on $Z_i$ and $Y_i$, the decoder produces the reconstruction $\hat{X}_i$,
using a decoding function $g : \{1, 2, \ldots, M\} \times \mathcal{X} \to \mathcal{X}$, i.e., $\hat{X}_i = g(Z_i, Y_i)$. This setting is
depicted in Fig. 1. For simplicity, we assume that $X_i$, $Y_i$ and $\hat{X}_i$ all take on values in the
same finite alphabet $\mathcal{X}$. The distortion in this setting is defined to be:

$$D = E\rho(X_i, \hat{X}_i) = \sum_{x,y} p(x,y)\,\rho(x,\hat{x}) \tag{1}$$

where $p(x, y)$ is the joint distribution of $x$ and $y$, $\hat{x} = g(f(x), y)$ is the reconstruction, and $\rho(x, \hat{x})$ is a distortion measure.

Let $Q(t)$, $0 < t < \infty$, be a real-valued convex function, where $\lim_{t \to 0} t \cdot Q(1/t) = 0$.
This requirement implies that $Q(t)$ is non-increasing, as was shown in [15]. We define
$0 \cdot Q(r/0) = 0$ for all $0 < r < \infty$. The generalized mutual information relative to the
function $Q$ is defined as

$$I^Q(X;Y) = \sum_{x,y} p(x,y)\,Q\left(\frac{p(x)\,p(y)}{p(x,y)}\right) \tag{2}$$

We apply the generalized DPT [15, Theorem 3] in the following way:

$$I^Q(X;\hat{X}) \le I^Q(X;Y,Z) \tag{3}$$

where we have used the fact that $X \leftrightarrow (Y, Z) \leftrightarrow \hat{X}$ is a Markov chain. Since $Z \leftrightarrow X \leftrightarrow Y$
is also a Markov chain, we have:

$$p(x, y, z) = p(x)\,p(y|x)\,p(z|x) \tag{4}$$

1 Throughout this paper, the symbols of $Z$ are not necessarily transformed into bits. Therefore, $\log M$ need
not be an integer.

Figure 1: The WZ setting

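and $I^Q(X;Y,Z)$ is given by:

$$\begin{aligned}
I^Q(X;Y,Z) &= \sum_{x,y,z} p(x,y,z)\,Q\left(\frac{p(y)\,p(z|y)}{p(y,z|x)}\right) \\
&= \sum_{x,y,z} p(x)\,p(y|x)\,p(z|x)\,Q\left(\frac{p(y)\,p(z|y)}{p(y|x)\,p(z|x)}\right) \\
&= \sum_{x,y,z} p(x)\,p(y|x)\,p(z|x)\,Q\left(\frac{\sum_{x'} p(x')\,p(y|x')\,p(z|x')}{p(y|x)\,p(z|x)}\right) \\
&= \sum_{y}\sum_{z}\sum_{x} p(x)\,p(y|x)\,p(z|x)\,Q\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x)\,p(z|x)}\right)
\end{aligned} \tag{5}$$

where we have defined the following $K$-dimensional vectors $\{\vec{p}_z\}_{z=1}^{M}$:

$$\vec{p}_z = [p(z|x),\ x \in \mathcal{X}] \tag{6}$$

and the following $K$-dimensional vectors $\{\vec{p}_y\}$, $y \in \mathcal{X}$:

$$\vec{p}_y = [p(x,y),\ x \in \mathcal{X}]. \tag{7}$$

By definition of $\{\vec{p}_z\}_{z=1}^{M}$ we have the following property:

$$\sum_{z=1}^{M} \vec{p}_z = [1, 1, \ldots, 1]. \tag{8}$$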

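We now define the following functions $\{G_y(\vec{p}_z)\}$, $y \in \mathcal{X}$:

$$G_y(\vec{p}_z) = \sum_{x} p(x)\,p(y|x)\,p(z|x)\,Q\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x)\,p(z|x)}\right) \tag{9}$$

Using these functions, Eq. (5) becomes:

$$I^Q(X;Y,Z) = \sum_{y}\sum_{z} G_y(\vec{p}_z) \tag{10}$$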
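The generalized mutual information and the data processing inequality (3) can be checked numerically on a small instance. The following is a minimal sketch, assuming a toy 4-letter uniform source, a symmetric channel and a two-level encoder; all numerical values, the helper names `gen_mutual_info` and `wz_joints`, and the choice of decoder are illustrative assumptions, not taken from the paper.

```python
import math

def gen_mutual_info(joint, Q):
    """I^Q(A;B) = sum_{a,b} p(a,b) * Q(p(a) p(b) / p(a,b)).

    Cells with zero probability are skipped, which implements the
    convention 0 * Q(r/0) = 0.
    """
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(p * Q(pa[a] * pb[b] / p)
               for a, row in enumerate(joint)
               for b, p in enumerate(row) if p > 0)

def wz_joints(px, pyx, f, g, M):
    """Joint laws of (X, (Y,Z)) and (X, Xhat) for Z = f(X), Xhat = g(Z, Y)."""
    K = len(px)
    joint_x_yz = [[0.0] * (K * M) for _ in range(K)]
    joint_x_xhat = [[0.0] * K for _ in range(K)]
    for x in range(K):
        z = f[x]
        for y in range(K):
            p = px[x] * pyx[x][y]          # p(x, y); z is determined by x
            joint_x_yz[x][y * M + z] += p  # the pair (y, z) is flattened
            joint_x_xhat[x][g[z][y]] += p
    return joint_x_yz, joint_x_xhat

# Hypothetical toy instance (0-indexed alphabets).
K, M = 4, 2
px = [0.25] * K
mu, eps = 0.7, 0.1
pyx = [[mu if y == x else eps for y in range(K)] for x in range(K)]
f = [0, 1, 0, 1]
# Decoder: most likely source letter inside bin z given y.
g = [[max((x for x in range(K) if f[x] == z),
          key=lambda x: px[x] * pyx[x][y])
      for y in range(K)] for z in range(M)]

q_renyi = lambda t: t ** (1 - 1.5)   # Q(t) = t^(1-alpha) with alpha = 1.5
q_log = lambda t: -math.log2(t)      # classical case

j_yz, j_xhat = wz_joints(px, pyx, f, g, M)
for name, Q in (("renyi", q_renyi), ("log", q_log)):
    print(name, gen_mutual_info(j_xhat, Q), gen_mutual_info(j_yz, Q))
```

For both choices of $Q$ the first printed value should not exceed the second, as (3) requires for any decoder.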

The functions $G_y(\vec{p}_z)$ have the following property:

Lemma 1. For any convex function $Q$, the functions $\{G_y(\vec{p}_z)\}$, $y \in \mathcal{X}$, are convex.

The proof is given in Appendix A. This convexity property has important implications
in the optimization of $I^Q(X;Y,Z)$, as will be discussed later. Assuming the encoder is given
by a deterministic function $f : \mathcal{X} \to \{1, \ldots, M\}$, Eq. (5) becomes the following:

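$$\begin{aligned}
I^Q(X;Y,Z) &= \sum_{y}\sum_{x}\sum_{z} p(x)\,p(y|x)\,p(z|x)\,Q\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x)\,p(z|x)}\right) \\
&= \sum_{y}\sum_{x} p(x)\,p(y|x)\,Q\left(\frac{\sum_{x' \in A_z} p(x',y)}{p(y|x)}\right)
\end{aligned} \tag{11}$$

where $z = f(x)$ and $A_z = \{x : f(x) = z\}$. Remember that we have defined $0 \cdot Q(r/0) = 0$.
Using $Q(t) = -\log t$ in (11), thus turning back to the classical DPT, we next show the
following result: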

$$R(D) - I(X;Y) \le \sup H(Z|Y) \tag{12}$$

where $R(D)$ is the classical RD function and the supremum is taken over all partitions of
$\mathcal{X}$ into $M$ disjoint subsets. This inequality stems from the Markov properties of the WZ
problem discussed above. We see that given a rate $R = \log M$, we should find the
encoder that maximizes $H(Z|Y)$. This is not surprising since, intuitively, we want the amount
of information that $Y$ carries about $Z$ to be as small as possible, to decrease the redundancy.
Ideally, we want the encoder output and the side information to be independent. This
is indeed achieved by the block coding scheme of Wyner and Ziv, in the limit of infinite
block length. The term $\sup\{H(Z|Y)\} + I(X;Y)$ will be referred to as the "capacity" of the
generalized channel between $(X, Z)$ and $Y$. This channel is composed of the DMC between
$X$ and $Y$ and a noiseless channel with capacity $\log M$ for the encoder's output. Since the
source distribution is given, the maximum rate of reliable communication over this channel
is indeed $I(X;Y) + \sup H(Z|Y)$.

Proof of Eq. (12). Using the function $Q(t) = -\log t$ in (11), we get:

$$\begin{aligned}
I^Q(X;Y,Z) &= H(Y,Z) - H(Y,Z|X) \\
&= H(Y) + H(Z|Y) - H(Y|X) - H(Z|Y,X) \\
&= I(X;Y) + H(Z|Y)
\end{aligned} \tag{13}$$

where we have used the fact that $H(Z|Y,X) = H(Z|X) = 0$, since $Z$ is a deterministic
function of $X$. On substituting into (3), we get:

$$\begin{aligned}
R(D) &\le I^Q(X;\hat{X}) \\
&\le I^Q(X;Y,Z) \\
&= I(X;Y) + H(Z|Y) \\
&\le I(X;Y) + \sup H(Z|Y),
\end{aligned} \tag{14}$$

which is equivalent to Eq. (12). □
Notice that if we allow non-deterministic encoders, as in Eq. (6), we get the following:

$$R(D) \le I(X;Y) + \sup\{H(Z|Y) - H(Z|X)\}, \tag{15}$$

where the supremum is taken over the same set as in (12). Although randomizing the
encoder can increase $H(Z|Y)$, it will also increase $H(Z|X)$. Due to the convexity property
presented in Lemma 1, the supremum is achieved by a deterministic encoder, as will be
discussed in the next section. Therefore, randomizing the encoder cannot improve the
bound in this setting.

In Section 3, we show some examples of scalar coding where this result gives lower
bounds on the distortion that are better than the bounds obtained from the classical
inequality $R_{WZ}(D) \le \log M$, where $R_{WZ}(D)$ is the WZ RD function.
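The decomposition of the classical quantity into $I(X;Y) + H(Z|Y)$ for a deterministic encoder can be verified directly from the joint distribution. The sketch below is illustrative only: the source, channel and encoder are hypothetical, and `classical_terms` is a helper name introduced here.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability entries are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def classical_terms(px, pyx, f, M):
    """Return I(X;Y), H(Z|Y) and I(X;Y,Z) for the deterministic encoder Z = f(X)."""
    K = len(px)
    pxy = [[px[x] * pyx[x][y] for y in range(K)] for x in range(K)]
    py = [sum(pxy[x][y] for x in range(K)) for y in range(K)]
    pyz = [[sum(pxy[x][y] for x in range(K) if f[x] == z) for z in range(M)]
           for y in range(K)]
    # Full joint p(x, y, z): mass only where z = f(x).
    pxyz = [pxy[x][y] if f[x] == z else 0.0
            for x in range(K) for y in range(K) for z in range(M)]
    h_x, h_y = entropy(px), entropy(py)
    h_xy = entropy([p for row in pxy for p in row])
    h_yz = entropy([p for row in pyz for p in row])
    i_xy = h_x + h_y - h_xy
    h_z_given_y = h_yz - h_y
    i_x_yz = h_x + h_yz - entropy(pxyz)   # I(X;Y,Z) = H(X) + H(Y,Z) - H(X,Y,Z)
    return i_xy, h_z_given_y, i_x_yz

# Hypothetical instance: uniform 4-letter source, symmetric channel, M = 2.
K, M = 4, 2
px = [0.25] * K
pyx = [[0.7 if y == x else 0.1 for y in range(K)] for x in range(K)]
i_xy, h_zy, i_x_yz = classical_terms(px, pyx, [0, 1, 0, 1], M)
print(i_xy, h_zy, i_x_yz)
```

The third printed value should equal the sum of the first two, and $H(Z|Y)$ cannot exceed $\log M$.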

2.1 Generalized DPT for fixed-rate scalar coding 

Assuming a deterministic encoder, the vectors $\{\vec{p}_z\}_{z=1}^{M}$, defined in (6), become:

$$\vec{p}_z = [\mathbb{1}_{1 \in A_z}, \mathbb{1}_{2 \in A_z}, \ldots, \mathbb{1}_{K \in A_z}], \tag{16}$$

where $\mathbb{1}_B$ is the indicator function of the event $B$. The $j$th coordinate of $\vec{p}_z$ is 1 if $j \in A_z$
and 0 elsewhere. Using these vectors, we can rewrite (11) in the following way:

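$$\begin{aligned}
I^Q(X;Y,Z) &= \sum_{x,y} p(x,y)\,Q\left(\frac{\vec{p}_{z(x)} \cdot \vec{p}_y}{p(y|x)}\right) \\
&= \sum_{y}\sum_{z}\sum_{x \in A_z} p(x,y)\,Q\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x)}\right) \\
&= \sum_{y}\sum_{z} \vec{p}_z \cdot \vec{\mu}_{z,y} \\
&= \sum_{y}\sum_{z} T_y(\vec{p}_z)
\end{aligned} \tag{17}$$

where $z(x) = f(x)$ and we have defined the following $K$-dimensional vectors:

$$\vec{\mu}_{z,y} = \left[p(x_1,y)\,Q\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x_1)}\right),\ p(x_2,y)\,Q\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x_2)}\right),\ \ldots\right] \tag{18}$$

and the set of functions $\{T_y\}_{y=1}^{K}$, $T_y : \mathbb{R}^K \to \mathbb{R}$:

$$T_y(\vec{p}_z) = \vec{p}_z \cdot \vec{\mu}_{z,y}. \tag{19}$$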

Notice that the vector $\vec{p}_y$ depends only on $y$ and that the inner product $\vec{p}_z \cdot \vec{p}_y$ is a function
of $z$ and $y$. Applying the RD bound [15, Theorem 4], we get:

$$R^Q(D) \le I^Q(X;\hat{X}) \le I^Q(X;Y,Z) \le C^Q, \tag{20}$$

where

$$R^Q(D) = \inf I^Q(X;\hat{X}) \tag{21}$$

and

$$C^Q = \sup I^Q(X;Y,Z) = \sup \sum_{y}\sum_{z} T_y(\vec{p}_z). \tag{22}$$

This gives us the following lower bound on the distortion $D$:

$$D \ge D^Q(C^Q), \tag{23}$$

where $D^Q(R)$ is the inverse function of $R^Q(D)$. The infimum is taken over all conditional
distributions $\{p(\hat{x}|x)\}$ that satisfy the distortion constraint $E\rho(X,\hat{X}) \le D$. The supremum
should be taken over all scalar encoders with a fixed rate $R = \log M$. Alternatively, we can
carry out a continuous optimization by taking the supremum over all sets of positive vectors
$\{\vec{p}_z\}_{z=1}^{M}$ that satisfy (8), i.e., all conditional distributions $\{p(z|x)\}$. Whereas the original
optimization problem may require exhaustive search over all encoders, and in this case our
mechanism is useless, the continuous problem may have an analytic solution. The result of
the continuous optimization will, of course, be greater than or equal to $C^Q$. However, the
functions $\{T_y(\vec{p}_z)\}$ might be neither convex nor concave. In this case, we can carry out the
optimization using the general form of $I^Q(X;Y,Z)$ given in (10), which is convex in $\vec{p}_z$.



Until now, we only handled fixed-rate codes. However, distortion lower bounds for codes
created by time-sharing fixed-rate codes are readily obtained from the above. This can be
seen as follows: For a given rate $R \in \mathcal{R} = \{\log 1, \log 2, \ldots, \log K\}$, let $D(R)$ be the minimum
distortion achievable by fixed-rate scalar codes with encoders $f : \mathcal{X} \to \{1, 2, \ldots, 2^R\}$ and
let $\underline{D}(R)$ be a lower bound on this distortion. We construct a variable-rate code whose rate
at time $t$ is $R_t$, $R_t \in \mathcal{R}$, $t = 1, \ldots, n$, by time-sharing scalar fixed-rate codes, under the
constraint:

$$\frac{1}{n}\sum_{t=1}^{n} R_t \le R. \tag{24}$$

The distortion of this time-sharing code is lower bounded by:

$$\begin{aligned}
D &\ge \frac{1}{n}\sum_{t=1}^{n} D(R_t) \\
&\ge \frac{1}{n}\sum_{t=1}^{n} \underline{D}(R_t) \\
&\ge \frac{1}{n}\sum_{t=1}^{n} D^*(R_t) \\
&\ge D^*\left(\frac{1}{n}\sum_{t=1}^{n} R_t\right) \\
&\ge D^*(R),
\end{aligned} \tag{25}$$

where $D^*(R)$ is the lower convex envelope of the set $\{\underline{D}(R)\}_{R \in \mathcal{R}}$ and is defined by:

$$D^*(R) = \min \sum_{i=1}^{K} \beta_i\,\underline{D}(\log i) \tag{26}$$

where the minimum is taken over the following set:

$$\left\{\beta_1, \beta_2, \ldots, \beta_K :\ \beta_i \ge 0\ \forall i,\ \sum_{i=1}^{K} \beta_i = 1,\ \sum_{i=1}^{K} \beta_i \log i \le R\right\} \tag{27}$$

We see that D*(R) lower bounds the distortion of any such time-sharing code with rate no 
more than R. Concrete examples of this result will be given in the next section. 
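The minimization in (26)-(27) is a linear program with one equality and one inequality constraint, so an optimal solution can be sought among mixtures of at most two rates. The following sketch relies on that observation, which is an assumption of this illustration rather than a statement from the paper; the function name and the sample bound values are hypothetical.

```python
import math

def lower_convex_envelope(d_lower, rate):
    """Evaluate D*(R) of (26)-(27) for rate >= 0.

    d_lower[i-1] is the lower bound for a code with i levels (rate log2(i)).
    Candidates are single rates not exceeding `rate`, and two-point mixtures
    of one rate at or below `rate` with one rate above it, mixed so that the
    average rate equals `rate`.
    """
    K = len(d_lower)
    rates = [math.log2(i) for i in range(1, K + 1)]
    best = math.inf
    for i in range(K):
        if rates[i] > rate:
            continue
        best = min(best, d_lower[i])
        for j in range(i + 1, K):
            if rates[j] > rate:
                lam = (rates[j] - rate) / (rates[j] - rates[i])
                best = min(best, lam * d_lower[i] + (1 - lam) * d_lower[j])
    return best

# Hypothetical per-rate bounds for K = 4 (levels M = 1, 2, 3, 4).
d_lower = [0.3, 0.2, 0.1, 0.0]
print(lower_convex_envelope(d_lower, 1.0))
```

With these sample values the points lie on a concave curve in $R$, so the envelope at $R = 1$ is obtained by mixing the two extreme rates and falls below the fixed-rate value 0.2.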

We end this subsection with an upper bound on $C^Q$ for the specific convex function
$Q(t) = t^{1-\alpha}$, $\alpha > 1$. Using this choice of $Q$ is equivalent to using the Renyi mutual
information of order $\alpha$, which is defined as [20]:

$$I_\alpha(X;Y) = \frac{1}{\alpha - 1}\log \sum_{x,y} p(x,y)\left(\frac{p(x)}{p(x|y)}\right)^{1-\alpha} = \frac{1}{\alpha - 1}\log I^Q(X;Y). \tag{28}$$

Thus, Eq. (20) can be written in the following equivalent form:

$$R_\alpha(D) \le I_\alpha(X;\hat{X}) \le I_\alpha(X;Y,Z) \le C_\alpha, \tag{29}$$

where

$$R_\alpha(D) = \frac{1}{\alpha - 1}\log R^Q(D) \tag{30}$$

and

$$C_\alpha = \frac{1}{\alpha - 1}\log C^Q. \tag{31}$$

The logarithmic measure is a special case, obtained for $\alpha \to 1$. Thus, optimizing over $\alpha$
can only improve the classical bounds. In addition, the function $Q(t) = t^{1-\alpha}$ is relatively
convenient to work with.

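Lemma 2. For the convex function $Q(t) = t^{1-\alpha}$, $1 < \alpha < 2$, we have the following upper
bound:

$$C^Q \le M^{\alpha-1} \sum_{y}\left(\sum_{x} p(x)\,p(y|x)^{\frac{1}{2-\alpha}}\right)^{2-\alpha}. \tag{32}$$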

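Proof. Using the function $Q(t) = t^{1-\alpha}$ in (17), we get:

$$\begin{aligned}
I^Q(X;Y,Z) &= \sum_{y}\sum_{z}\sum_{x \in A_z} p(x,y)\left(\frac{\vec{p}_z \cdot \vec{p}_y}{p(y|x)}\right)^{1-\alpha} \\
&= \sum_{y}\sum_{z} (\vec{p}_z \cdot \vec{p}_y)^{1-\alpha} \sum_{x \in A_z} p(x,y)\,p(y|x)^{\alpha-1} \\
&= \sum_{y}\sum_{z} \left[\sum_{x} p(x,y)\,\mathbb{1}_{x \in A_z}\right]^{1-\alpha} \left[\sum_{x} p(x)\,p(y|x)^{\alpha}\,\mathbb{1}_{x \in A_z}\right].
\end{aligned} \tag{33}$$

In order to get an upper bound on $I^Q(X;Y,Z)$, we define:

$$q = 1/(\alpha - 1) \tag{34}$$

$$r = 1/(2 - \alpha) \tag{35}$$

and the following $K$-dimensional vectors:

$$\vec{a}_y = \left[p(x_1,y)^{\alpha-1}\,\mathbb{1}_{1 \in A_z},\ p(x_2,y)^{\alpha-1}\,\mathbb{1}_{2 \in A_z},\ \ldots\right],$$

$$\vec{b}_y = \left[p(x_1)^{2-\alpha}\,p(y|x_1)\,\mathbb{1}_{1 \in A_z},\ p(x_2)^{2-\alpha}\,p(y|x_2)\,\mathbb{1}_{2 \in A_z},\ \ldots\right].$$

Applying these definitions to (33), we have:

$$I^Q(X;Y,Z) = \sum_{y}\sum_{z}\left(\sum_{k=1}^{K} a_{y,k}^{q}\right)^{-1/q} (\vec{a}_y \cdot \vec{b}_y). \tag{36}$$

Assuming $\alpha$ is in the range $1 < \alpha < 2$, we have $1 < q, r < \infty$ and $1/q + 1/r = 1$. Thus we
can apply the Holder inequality to each term in the sum, in the following way:

$$\left(\sum_{k=1}^{K} a_{y,k}^{q}\right)^{-1/q} (\vec{a}_y \cdot \vec{b}_y) \le \left(\sum_{k=1}^{K} b_{y,k}^{r}\right)^{1/r}. \tag{37}$$

We then have:

$$\begin{aligned}
I^Q(X;Y,Z) &\le \sum_{y}\sum_{z}\left(\sum_{k=1}^{K} b_{y,k}^{r}\right)^{1/r} \\
&= \sum_{y}\sum_{z}\left(\vec{p}_z \cdot \left[p(x_1)\,p(y|x_1)^{\frac{1}{2-\alpha}},\ p(x_2)\,p(y|x_2)^{\frac{1}{2-\alpha}},\ \ldots\right]\right)^{2-\alpha} \\
&= M \sum_{y} \frac{1}{M}\sum_{z}\left(\vec{p}_z \cdot \left[p(x_1)\,p(y|x_1)^{\frac{1}{2-\alpha}},\ p(x_2)\,p(y|x_2)^{\frac{1}{2-\alpha}},\ \ldots\right]\right)^{2-\alpha} \\
&\le M \sum_{y}\left(\frac{1}{M}\sum_{z} \vec{p}_z \cdot \left[p(x_1)\,p(y|x_1)^{\frac{1}{2-\alpha}},\ p(x_2)\,p(y|x_2)^{\frac{1}{2-\alpha}},\ \ldots\right]\right)^{2-\alpha} \\
&= M^{\alpha-1} \sum_{y}\left(\sum_{x} p(x)\,p(y|x)^{\frac{1}{2-\alpha}}\right)^{2-\alpha},
\end{aligned} \tag{38}$$

where the second inequality is due to Jensen, using the fact that the function $t^{2-\alpha}$
is concave for $1 < \alpha < 2$. The last equality follows from the constraint (8). □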
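The bound of Lemma 2 can be sanity-checked by exhaustive search over all deterministic encoders of a small alphabet. The sketch below uses a hypothetical 3-letter source and channel; the bound is computed from the final line of the proof, and `iq_renyi` and `lemma2_bound` are helper names introduced for this illustration.

```python
from itertools import product

def iq_renyi(px, pyx, f, M, alpha):
    """I^Q(X;Y,Z) for Q(t) = t^(1-alpha) and a deterministic encoder f,
    following the last line of (33). Empty or zero-mass cells contribute 0."""
    K = len(px)
    total = 0.0
    for y in range(K):
        for z in range(M):
            cell = [x for x in range(K) if f[x] == z]
            mass = sum(px[x] * pyx[x][y] for x in cell)
            if mass > 0:
                total += mass ** (1 - alpha) * sum(
                    px[x] * pyx[x][y] ** alpha for x in cell)
    return total

def lemma2_bound(px, pyx, M, alpha):
    """M^(alpha-1) * sum_y (sum_x p(x) p(y|x)^(1/(2-alpha)))^(2-alpha)."""
    K = len(px)
    s = 2 - alpha
    return M ** (alpha - 1) * sum(
        sum(px[x] * pyx[x][y] ** (1 / s) for x in range(K)) ** s
        for y in range(K))

# Hypothetical source and channel (rows of pyx are p(.|x)).
px = [0.5, 0.3, 0.2]
pyx = [[0.7, 0.2, 0.1],
       [0.1, 0.6, 0.3],
       [0.25, 0.25, 0.5]]
M, alpha = 2, 1.5

# Exhaustive search over all maps f: X -> {0, ..., M-1}.
c_q = max(iq_renyi(px, pyx, f, M, alpha)
          for f in product(range(M), repeat=len(px)))
bound = lemma2_bound(px, pyx, M, alpha)
print(c_q, bound)
```

The exhaustive maximum should not exceed the closed-form bound; the search is only feasible for very small $K$, which is the motivation for having a bound that needs no search.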

The usefulness of this result stems from its generality. It holds for any source distribution
and any transition probability matrix $\{p(y|x)\}$. This result is used in Section 3, along with
tighter bounds on the capacity that can be achieved in several special cases. It will also be
shown that the application of this result to large alphabets yields non-trivial bounds. For
$Q(t) = -\log t$, Eq. (32) is equivalent to the following:

$$C \le \lim_{\alpha \to 1} \frac{1}{\alpha - 1}\log\left(M^{\alpha-1}\sum_{y}\left(\sum_{x} p(x)\,p(y|x)^{\frac{1}{2-\alpha}}\right)^{2-\alpha}\right) = \log M + I(X;Y), \tag{39}$$

where $C$ is the classical capacity. Therefore, Eq. (32) can be viewed as a generalization of (39).
Notice that the bound in (39) can be derived easily from (12). This simple bound states
that a maximum amount of information is transferred to the decoder when the output of
the deterministic encoder is uniformly distributed and independent of the side information.
The proof of (39) is given in Appendix B.



2.2 The generalized rate-distortion function for the uniform source distribution

In this subsection, we handle the left-hand side of (20), i.e., the generalized RD function. We
start with a general characterization. Then, in Lemma 3, we give a closed-form expression
for the generalized RD function of uniformly distributed sources w.r.t. general symmetric
distortion measures. Finally, in Lemma 4, we provide an explicit expression of this function
for the special case of the Hamming distortion measure. These results will be used in the
next section to derive concrete lower bounds on the distortion from the DPT we formulated
in (20).



By definition of the generalized mutual information:

$$\begin{aligned}
I^Q(X;\hat{X}) &= \sum_{x,\hat{x}} p(x,\hat{x})\,Q\left(\frac{p(\hat{x})}{p(\hat{x}|x)}\right) \\
&= \sum_{x}\sum_{\hat{x}} p(x)\,p(\hat{x}|x)\,Q\left(\frac{\sum_{x'} p(x')\,p(\hat{x}|x')}{p(\hat{x}|x)}\right) \\
&= \sum_{x}\sum_{\hat{x}} p(x)\,p(\hat{x}|x)\,Q\left(\frac{\vec{p} \cdot \vec{p}_{\hat{x}}}{p(\hat{x}|x)}\right)
\end{aligned} \tag{40}$$

where we have defined the following $K$-dimensional vectors:

$$\vec{p} = [p(x),\ x \in \mathcal{X}], \qquad \vec{p}_{\hat{x}} = [p(\hat{x}|x),\ x \in \mathcal{X}] \tag{41}$$

and the following function:

$$\Psi(\vec{p}_{\hat{x}}) = \sum_{x} p(x)\,p(\hat{x}|x)\,Q\left(\frac{\vec{p} \cdot \vec{p}_{\hat{x}}}{p(\hat{x}|x)}\right). \tag{42}$$

For any convex function $Q$, $\Psi(\vec{p}_{\hat{x}})$ is a convex function. This can be shown easily by the
same method we used to prove Lemma 1. The generalized RD function is given by:

$$R^Q(D) = \inf\left\{\sum_{\hat{x}} \Psi(\vec{p}_{\hat{x}})\right\}, \tag{43}$$

where the infimum is taken over all conditional distributions under the constraint:

$$\sum_{x}\sum_{\hat{x}} p(x)\,p(\hat{x}|x)\,\rho(x,\hat{x}) \le D. \tag{44}$$

$I^Q(X;\hat{X})$ is, of course, convex in the set $\{\vec{p}_{\hat{x}}\}_{\hat{x}=1}^{K}$. Thus, this is a standard problem of
minimizing a convex function over a convex set under linear constraints. Generally, this
optimization problem can be solved numerically by various algorithms (see, e.g., [23, Chap.
3]).

In the next steps we will give analytic expressions for the generalized RD function under
certain conditions. We refer to a distortion measure $\rho(x,\hat{x})$ as symmetric if the rows of
the distortion matrix, $\{\rho(x,\hat{x})\}$, are permutations of each other and the columns are
permutations of each other. A uniformly distributed source is a source for which $p(x) = 1/K$,
$\forall x \in \mathcal{X}$.

Lemma 3. Consider a discrete source $X$, uniformly distributed over a finite alphabet $\mathcal{X}$,
and let $Q(t)$, $0 < t < \infty$, be any real-valued differentiable convex function. Then, $R^Q(D)$
w.r.t. any symmetric distortion measure is given by:

$$R^Q(D) = \sum_{k=1}^{K} p_k\,Q\left(\frac{1}{K p_k}\right) \tag{45}$$

where $\{p_k\}_{k=1}^{K}$ is a probability distribution, which is given by the following equations ($k =
1, \ldots, K$):

$$Q\left(\frac{1}{K p_k}\right) - \frac{1}{K p_k}\,Q'\left(\frac{1}{K p_k}\right) + \lambda_1 + \lambda_2 \rho_k - \mu_k = 0, \tag{46}$$

where $\{\rho_k\}_{k=1}^{K}$ are the elements of each row of the matrix $\{\rho(x,\hat{x})\}$ and $\lambda_1$, $\lambda_2$, $\{\mu_k\}_{k=1}^{K}$ are
constants, chosen such that:

$$\sum_{k=1}^{K} p_k = 1, \qquad \sum_{k=1}^{K} \rho_k p_k = D, \qquad \mu_k \ge 0, \quad \mu_k \cdot p_k = 0, \quad k = 1, \ldots, K. \tag{47}$$

Notice that the equations (46) are decoupled, thus each $p_k$ can be calculated separately.
The proof of Lemma 3 is given in Appendix C.
Example. Taking $Q(t) = t^{-s}$, we get:

$$K^s (s + 1)\,p_k^s + \lambda_1 + \lambda_2 \rho_k - \mu_k = 0, \tag{48}$$

which, after rescaling the constants, is equivalent to the following:

$$p_k = c\,(\mu_k - \lambda - \rho_k)^{1/s}, \tag{49}$$

where each specific value of $\lambda$ corresponds to a point on the generalized RD curve and $c$ is a
normalization factor.

For the Hamming distortion measure, defined by:

$$\rho(x,\hat{x}) = \begin{cases} 0 & x = \hat{x} \\ 1 & x \ne \hat{x} \end{cases} \tag{50}$$

we have the following closed-form expression:

Lemma 4. Consider a discrete source $X$, uniformly distributed over a finite alphabet $\mathcal{X}$,
and let $Q(t)$, $0 < t < \infty$, be any real-valued convex function. Then, $R^Q(D)$ w.r.t. the
Hamming distortion measure is given by:

$$R^Q(D) = (1 - D) \cdot Q\left(\frac{1}{K(1 - D)}\right) + D \cdot Q\left(\frac{K - 1}{K D}\right). \tag{51}$$



Notice that Lemma 4 does not require the differentiability of the convex function $Q$.
The proof is given in Appendix D. 
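The closed form of Lemma 4 is easy to evaluate and to invert numerically. The sketch below evaluates (51), compares it with the familiar classical expression $\log K - h(D) - D\log(K-1)$ for $Q(t) = -\log t$, and inverts it by bisection. The bisection assumes that $R^Q(D)$ is decreasing on $(0, 1 - 1/K)$, which holds for the two choices of $Q$ used here; the function names and parameter values are illustrative assumptions.

```python
import math

def rq_hamming(D, K, Q):
    """Generalized RD function (51): uniform source of size K, Hamming
    distortion, 0 < D <= 1 - 1/K."""
    return (1 - D) * Q(1 / (K * (1 - D))) + D * Q((K - 1) / (K * D))

def dq_inverse(target, K, Q, tol=1e-12):
    """Solve R^Q(D) = target for D by bisection on (0, 1 - 1/K),
    assuming R^Q is decreasing there."""
    lo, hi = 1e-15, 1 - 1 / K
    if rq_hamming(lo, K, Q) <= target:
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rq_hamming(mid, K, Q) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

K, ALPHA, D0 = 4, 1.5, 0.2
q_renyi = lambda t: t ** (1 - ALPHA)
q_log = lambda t: -math.log2(t)

# Renyi case: (51) reduces to K^(a-1) [(1-D)^a + D^a (K-1)^(1-a)].
renyi_closed = K ** (ALPHA - 1) * (
    (1 - D0) ** ALPHA + D0 ** ALPHA * (K - 1) ** (1 - ALPHA))
# Classical case: log K - h(D) - D log(K-1).
h_d0 = -D0 * math.log2(D0) - (1 - D0) * math.log2(1 - D0)
classical = math.log2(K) - h_d0 - D0 * math.log2(K - 1)

print(rq_hamming(D0, K, q_renyi), renyi_closed)
print(rq_hamming(D0, K, q_log), classical)
print(dq_inverse(rq_hamming(D0, K, q_renyi), K, q_renyi))
```

Given a value (or an upper bound) of $C^Q$, calling `dq_inverse` with that value yields the distortion lower bound $D^Q(C^Q)$ of (23) for this source and distortion measure.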

The general form of $R^Q(D)$ enables the use of any convex function $Q$. These results
make the Ziv-Zakai mechanism much more tractable, at least for the case of uniform sources. 
In addition, they provide direct solutions for a broad class of classical RD functions. We 
use these results in the next section to derive non-trivial bounds on the distortion of scalar 
coding in several cases. 

3 Applications 

In this section, we use the results of Section 2 to derive lower bounds on the distortion
in several cases. Non-trivial bounds are obtained using the convex function $Q(t) = t^{1-\alpha}$,
$\alpha > 1$, which was mentioned above. We assume that the source is uniformly distributed.



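Under these conditions, Eq. (19) becomes:

$$T_y(\vec{p}_z) = K^{\alpha-2}\,\frac{\vec{p}_z \cdot \vec{p}_y^{\,\alpha}}{(\vec{p}_z \cdot \vec{p}_y)^{\alpha-1}}, \tag{52}$$

where we have defined the following $K$-dimensional vectors:

$$\vec{p}_y = [p(y|x),\ x \in \mathcal{X}], \qquad \vec{p}_y^{\,\alpha} = [p(y|x)^{\alpha},\ x \in \mathcal{X}]. \tag{53}$$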



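Applying (51) and (52) to (20), we get:

$$R^Q(D) = K^{\alpha-1}\left[(1 - D)^{\alpha} + \frac{D^{\alpha}}{(K - 1)^{\alpha-1}}\right] \le K^{\alpha-2} \sup\left\{\sum_{y,z} \frac{\vec{p}_z \cdot \vec{p}_y^{\,\alpha}}{(\vec{p}_z \cdot \vec{p}_y)^{\alpha-1}}\right\}, \tag{54}$$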



where the supremum is taken over all sets of positive vectors $\{\vec{p}_z\}_{z=1}^{M}$ that satisfy (8), in
order to carry out continuous optimization. It is easy to see that the optimization is done
over a convex set. The functions $T_y(\vec{p}_z)$ are neither convex nor concave in this case, but
we can use the general form of $I^Q(X;Y,Z)$ given in (10). By simple substitution under the
above conditions, $G_y(\vec{p}_z)$ has the following form:

$$G_y(\vec{p}_z) = K^{\alpha-2}\,\frac{\vec{p}_z^{\,\alpha} \cdot \vec{p}_y^{\,\alpha}}{(\vec{p}_z \cdot \vec{p}_y)^{\alpha-1}}, \tag{55}$$

where $\vec{p}_z^{\,\alpha}$ is the vector obtained from $\vec{p}_z$ by raising each element to the power of $\alpha$. Clearly,
$G_y(\vec{p}_z) = T_y(\vec{p}_z)$ for any deterministic encoder. The functions $\{G_y(\vec{p}_z)\}$ are convex, as shown
in Lemma 1. Therefore, the supremum of $I^Q(X;Y,Z)$ is attained on the boundary of the
convex set. Finding the supremum on the boundary requires searching over all vertices
of the set, i.e., over all sets of binary vectors $\{\vec{p}_z\}_{z=1}^{M}$ that satisfy (8). This means, of
course, returning to discrete optimization and performing it over all deterministic encoders.
Seemingly, this makes the mechanism above useless. However, at least for some cases, $C^Q$
can be calculated directly, as shown in the following examples. In addition, we can
upper bound $C^Q$ by using (32). This upper bound may give us non-trivial bounds, as
shown in Example 2. It is also shown to be very useful when handling large alphabets, as
demonstrated later in this section. Finally, notice that optimizing over $\alpha$, separately for each rate,
will produce the best lower bound on the achievable distortion at this rate.

Example 1

The symmetric DMC is defined by:

$$p(y|x) = \begin{cases} \mu & y = x \\ \epsilon & y \ne x \end{cases} \tag{56}$$

where $\mu, \epsilon \in [0, 1]$, $\mu > \epsilon$, and $\mu + (K - 1)\epsilon = 1$. The distortion measure we use is the
Hamming distortion, defined in (50). Under these conditions, the minimal achievable distortion
of a scalar source code with a fixed rate $R = \log M$ is:

$$D(M) = \epsilon(K - M). \tag{57}$$
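For a small alphabet, the value in (57) can be confirmed by brute force. The sketch below enumerates every encoder that uses all $M$ levels, pairs it with the decoder that picks the most probable source letter in the received bin, and keeps the smallest distortion. The parameter values and the function name `min_distortion` are hypothetical, and a uniform source is assumed.

```python
from itertools import product

def min_distortion(K, M, mu):
    """Smallest Hamming distortion over all encoders f: X -> {0..M-1} that use
    every level, with the best decoder for each encoder. Uniform source and
    the symmetric channel (56) with eps = (1 - mu) / (K - 1) are assumed."""
    eps = (1 - mu) / (K - 1)
    pxy = [[(mu if y == x else eps) / K for y in range(K)] for x in range(K)]
    best = None
    for f in product(range(M), repeat=K):
        if len(set(f)) < M:
            continue  # skip encoders that leave some level unused
        # Probability of correct decoding: for each (y, z) the decoder
        # outputs the letter in bin z with the largest joint probability.
        correct = sum(max(pxy[x][y] for x in range(K) if f[x] == z)
                      for y in range(K) for z in range(M))
        d = 1 - correct
        best = d if best is None else min(best, d)
    return best

K, mu = 4, 0.7
for M in range(1, K + 1):
    print(M, min_distortion(K, M, mu))
```

With these values $\epsilon = 0.1$, so the printed distortions should match $\epsilon(K - M)$ for each $M$.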



The proof is given in Appendix E. Knowing the best achievable distortion in this case, we
can compare it to the bounds we get from (54) to examine their quality. The generalized
capacity (22) for this channel is given by:

$$\begin{aligned}
C^Q &= K^{\alpha-2} \sup\left\{\sum_{y,z} \frac{\vec{p}_z \cdot \vec{p}_y^{\,\alpha}}{(\vec{p}_z \cdot \vec{p}_y)^{\alpha-1}}\right\} \\
&= K^{\alpha-2}\,\epsilon \cdot \sup\left\{\sum_{z} q_\alpha(M_z)\right\}, \qquad
q_\alpha(M_z) = M_z\,\frac{M_z + \mu^{\alpha}/\epsilon^{\alpha} - 1}{(M_z + \mu/\epsilon - 1)^{\alpha-1}} + (K - M_z)\,M_z^{2-\alpha},
\end{aligned} \tag{58}$$

where $M_z$, $M_z \in \{1, \ldots, K - M + 1\}$, is the cardinality of $A_z$, i.e., the number of source
symbols that are encoded to $z$. Obviously, $\sum_z M_z = K$. Notice that the supremum is taken
over all deterministic encoders, where each encoder is represented by a specific set $\{\vec{p}_z\}_{z=1}^{M}$
as defined in (16). The second equality is proved in Appendix F. The function $q_\alpha(M_z)$ is
concave for $1 < \alpha < 2$, and may be concave also outside this range, depending on the
channel parameters, as shown in Appendix F. When $q_\alpha(M_z)$ is concave, we can bound the
supremum by taking equal $M_z$'s, i.e., $M_z = K/M$, $\forall z$, and we get:

$$C^Q \le K^{\alpha-1}\,\epsilon\left[\frac{K/M + \mu^{\alpha}/\epsilon^{\alpha} - 1}{(K/M + \mu/\epsilon - 1)^{\alpha-1}} + (M - 1)\left(\frac{K}{M}\right)^{2-\alpha}\right]. \tag{59}$$

If $M$ divides $K$, this bound is achieved by any feasible encoder that partitions the source
alphabet into equally sized subsets, thus the optimization is exact. An example for specific
values of $\mu$ and $\epsilon$ is presented in Fig. 2. The bound is compared with the bound obtained
from the classical DPT (12), the bound obtained from the classical inequality $R_{WZ}(D) \le
\log M$, the bound obtained by using (32), and the exact solution of Eq. (57). The WZ RD
function was calculated using the Blahut-Arimoto-type algorithm presented in [22]. Eq.
(54) was optimized over $\alpha$, for each $M < K$, so as to get the best lower bound on the
distortion. We see that even the classical DPT gives us non-trivial lower bounds and that
the lower bound obtained from (59) is much better than the trivial bound obtained from
$R_{WZ}(D)$. The lower bound obtained from (32) is not useful in this case. There is a gap
between the exact solution and the best bound, even for $M = 2$, where the optimization
(22) is exact.

Figure 2: $K = 4$, $\mu = 0.7$. Plus - the lower bound obtained from (59). Circle - the lower
bound obtained from the classical DPT (12). Star - the exact solution. Solid line - the
lower bound obtained from $R_{WZ}(D)$. Square - the lower bound obtained from (32).



Example 2 

The symmetric DMC is defined by:

$$p(y|x) = \begin{cases} 1/l & y \in \{x \bmod K, \ldots, (x + l - 1) \bmod K\} \\ 0 & \text{otherwise} \end{cases} \tag{60}$$

where $l$ is a positive integer, $l < K$, and $K \bmod K$ is defined to be $K$. Given an input $x$, the
channel produces one of $l$ values with equal probability. The generalized capacity (22) for
this channel is given by:

$$\begin{aligned}
C^Q &= K^{\alpha-2} \sup\left\{\sum_{y,z} \frac{\vec{p}_z \cdot \vec{p}_y^{\,\alpha}}{(\vec{p}_z \cdot \vec{p}_y)^{\alpha-1}}\right\} \\
&= K^{\alpha-2} \sup\left\{\sum_{y,z} \frac{M_{y,z}\,(1/l)^{\alpha}}{(M_{y,z}/l)^{\alpha-1}}\right\} \\
&= K^{\alpha-2}\,l^{-1} \cdot \sup\left\{\sum_{y,z} M_{y,z}^{2-\alpha}\right\},
\end{aligned} \tag{61}$$

where $M_{y,z} = l \cdot [\vec{p}_z \cdot \vec{p}_y]$. It is easy to see that $\sum_z M_{y,z} = l$. For $1 < \alpha < 2$, the function
$M_{y,z}^{2-\alpha}$ is concave in $M_{y,z}$. Thus, the supremum is achieved by setting $M_{y,z} = l/M$, $\forall \{y, z\}$:

$$C^Q = K^{\alpha-2} \cdot l^{-1} \cdot K \cdot M \cdot (l/M)^{2-\alpha} = K^{\alpha-1} \cdot (M/l)^{\alpha-1}. \tag{62}$$

If $M$ divides $l$, equal $M_{y,z}$'s can be obtained by the following feasible encoder:

$$z = f(x) = 1 + x \bmod M. \tag{63}$$

Therefore, in this case, the optimization is exact. For $\alpha > 2$, $C^Q$ is infinite, because we
can always set some $M_{y,z}$ to 0 by an appropriate choice of the encoder. Thus, this range of
$\alpha$ does not lead to a useful bound. An example for specific values of $K$ and $l$ is presented
in Fig. 3. The lower bound on the distortion, which coincides with the bound obtained
from (32) (the upper bound on $C^Q$ is tight for this channel), is compared with the bound
obtained from the classical DPT (12) and the bound obtained by the classical inequality
$R_{WZ}(D) \le \log M$. Eq. (54) was optimized over $\alpha$, for each $M < K$, so as to get the
best lower bound on the distortion. We see that in this case, the generalized DPT leads to
bounds that are better than the trivial bound, whereas the classical DPT does not lead to
a useful bound. We also present the exact distortion of the encoder defined in (63), which
is, of course, an upper bound on the distortion. Thus, the distortion of the optimal encoder
must be in the range between this upper bound and our highest lower bound. For $M = l$,
zero distortion can indeed be achieved using the encoder defined in (63), thus our lower
bound at this point is tight.
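The quantities of this example can be computed directly for small parameters. The following sketch builds a 0-indexed version of the channel (60), evaluates the sum in (61) for the modulo encoder, and computes the distortion of that encoder with the best decoder. The parameter choices ($K = 6$ with $l = 4$, $M = 2$ and with $l = M = 3$) are assumptions made so that $M$ divides both $l$ and $K$, which keeps the cells balanced across the cyclic wrap-around; they are not the values used in Fig. 3.

```python
def cyclic_channel(K, l):
    """0-indexed version of (60): given x, y is uniform on {x, ..., x+l-1} mod K."""
    pyx = [[0.0] * K for _ in range(K)]
    for x in range(K):
        for d in range(l):
            pyx[x][(x + d) % K] = 1.0 / l
    return pyx

def capacity_term(K, l, M, alpha, f):
    """K^(alpha-2) * sum_{y,z} (p_z . p_y^alpha) / (p_z . p_y)^(alpha-1)
    for a uniform source and a deterministic encoder f."""
    pyx = cyclic_channel(K, l)
    total = 0.0
    for y in range(K):
        for z in range(M):
            num = sum(pyx[x][y] ** alpha for x in range(K) if f[x] == z)
            den = sum(pyx[x][y] for x in range(K) if f[x] == z)
            if den > 0:
                total += num / den ** (alpha - 1)
    return K ** (alpha - 2) * total

def map_distortion(K, l, M, f):
    """Hamming distortion of encoder f with the most-probable-letter decoder."""
    pyx = cyclic_channel(K, l)
    correct = sum(max((pyx[x][y] / K for x in range(K) if f[x] == z),
                      default=0.0)
                  for y in range(K) for z in range(M))
    return 1 - correct

alpha = 1.5
K, l, M = 6, 4, 2
f_mod = [x % M for x in range(K)]          # 0-indexed analogue of (63)
print(capacity_term(K, l, M, alpha, f_mod), (K * M / l) ** (alpha - 1))
print(map_distortion(6, 3, 3, [x % 3 for x in range(6)]))
```

In the first case the modulo encoder gives equal $M_{y,z}$ and the computed value should agree with (62); in the second case, with $M = l$, the computed distortion should be zero up to rounding.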

Figure 3: $K = 4$, $l = 3$. Square - the lower bound obtained from (59) and (32). Circle - the
lower bound obtained from the classical DPT (12). Solid line - the lower bound obtained
from $R_{WZ}(D)$. Star - the exact distortion of the encoder (63).

Large alphabets

In this part we show that as the alphabet size increases, we obtain interesting bounds
on the performance of scalar coding. These useful bounds can be obtained for a large
variety of channels, without any symmetry requirements. The results are obtained using the
upper bound on the "capacity", presented in Lemma 2. This bound becomes tighter as the
alphabet size increases, for various channels. For these channels, we can get close to the last
upper bound in Eq. (38), i.e., achieve values
$\left\{\vec{p}_z \cdot \left[p(x_1)\,p(y|x_1)^{\frac{1}{2-\alpha}},\ p(x_2)\,p(y|x_2)^{\frac{1}{2-\alpha}},\ \ldots\right]\right\}_{z=1}^{M}$
which are almost equal to each other, by a suitable choice of encoder. This is because
$\{p(y|x)\}$ is composed of a large number of probabilities with small values. Using the bound
of Lemma 2, we bypass the problem of optimizing the capacity for general channels. As was
mentioned earlier in Subsection 2.3, this optimization is in general a convex maximization
problem which requires searching over all possible encoders. Using our lower bounds along
with simple upper bounds, we give the performance range for scalar coding. These results
are, of course, interesting from the practical point of view.

In the following examples we assume that the sources are uniform. This is because 
analytic expression for the generalized RD function of general sources is not available. We 
use the Hamming distortion measure for convenience. Bounds for more general distortion 
measures can be calculated using the result of Lemma 3. In the two former examples,
we compare our results to the WZ RD function. This function was calculated using the
algorithm presented in [22]. However, the computational complexity of this algorithm is of
order K . Therefore, this algorithm is not practical for large alphabets. Since no other 
efficient algorithms are known, the computation of the WZ RD function for large alphabets 
is problematic. As a result, even the trivial bound obtained from $R_{WZ}(D) \le \log M$ does not lead to a closed-form expression. This makes our results even more interesting.

Instead of comparing our distortion bounds to the bounds obtained from $R_{WZ}(D) \le \log M$, we compare them to the following linear function:

$$D(R) = D_{\max}\left(1 - \frac{R}{H(X|Y)}\right), \tag{64}$$

where $D_{\max} = R_{WZ}^{-1}(0)$, i.e., the lowest achievable distortion for rate $R = 0$. The function $D(R)$ is simply the straight line obtained by time-sharing the two known endpoints of the $R_{WZ}^{-1}(R)$ curve, $(0, D_{\max})$ and $(H(X|Y), 0)$. This line is, of course, a trivial upper bound on $R_{WZ}^{-1}(R)$. As an upper bound on the best achievable distortion of a fixed-rate code, we use the performance of the code composed of the encoder defined in (63), along with the corresponding optimal decoder, given by:

$$\hat{x} = g(z,y) = \arg\max_{x \in A_z} \{p(y|x)\}. \tag{65}$$

Remember that the optimal decoding strategy is maximum likelihood because the source is uniform and we use the Hamming distortion measure. The choice of the encoder (63) seems natural when handling channels with transition probabilities that decrease with the distance. In this case, we want adjacent symbols to be in different subsets of the encoder. Obviously, any code that performs better will improve the performance range.
In the first example, the DMC is defined by: 

$$p(y|x) = c_x \exp\left(-\frac{(y-x)^2}{2\sigma^2}\right), \tag{66}$$

where $c_x$ is a normalization factor such that $\sum_y p(y|x) = 1$. This is a 'Gaussian'-like channel. Notice that this channel is not symmetric. Performance ranges for different values of $K$ are presented in Fig. 4. We see that we get bounds that are higher than $D(R)$ and, therefore, higher than $R_{WZ}^{-1}(R)$. These bounds also show that the time-sharing of $D(R)$ performs better than any fixed-rate code for a considerable range of rates. Similar results can be presented for various channels, not necessarily additive.
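The quantities compared in this example can be sketched numerically. The following is a minimal sketch, not the authors' code: it builds the channel (66) for a uniform source, evaluates $H(X|Y)$ and $D_{\max}$, the time-sharing line (64), and the Hamming distortion of a fixed-rate code with the maximum-likelihood decoder (65). The encoder (63) is not reproduced here, so the rule `z = x mod M` is an assumption chosen only because it places adjacent symbols in different subsets; all function names are hypothetical.

```python
import math

def gaussian_like_channel(K, sigma):
    """Rows x, columns y: p(y|x) = c_x * exp(-(y - x)^2 / (2 sigma^2)), as in (66)."""
    rows = []
    for x in range(K):
        w = [math.exp(-((y - x) ** 2) / (2.0 * sigma ** 2)) for y in range(K)]
        c_x = 1.0 / sum(w)  # normalization factor c_x
        rows.append([c_x * v for v in w])
    return rows

def cond_entropy_bits(P):
    """H(X|Y) in bits for a uniform source over K symbols."""
    K = len(P)
    h = 0.0
    for y in range(K):
        p_y = sum(P[x][y] for x in range(K)) / K
        for x in range(K):
            p_xy = P[x][y] / K
            if p_xy > 0.0:
                h -= p_xy * math.log2(p_xy / p_y)
    return h

def modulo_code_distortion(P, M):
    """Hamming distortion of the ASSUMED encoder z = x mod M with the decoder (65)."""
    K = len(P)
    correct = 0.0
    for z in range(M):
        A_z = range(z, K, M)  # subset of source symbols mapped to index z
        for y in range(K):
            correct += max(P[x][y] for x in A_z) / K
    return 1.0 - correct

def d_max(P):
    """Lowest distortion at rate R = 0: the decoder sees only y."""
    return modulo_code_distortion(P, 1)

def time_sharing_line(R, P):
    """D(R) of (64), clipped at zero for R above H(X|Y)."""
    return max(0.0, d_max(P) * (1.0 - R / cond_entropy_bits(P)))

if __name__ == "__main__":
    P = gaussian_like_channel(64, 0.5)
    for M in (1, 2, 4, 8):
        R = math.log2(M)
        print(R, time_sharing_line(R, P), modulo_code_distortion(P, M))
```

The distortion of the fixed-rate code is an upper bound on the best achievable distortion at rate $R = \log M$, so it plays the role of the "higher curve" in the performance range; the lower bounds of the paper are not computed in this sketch.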



In the second example, the DMC is the same as in Example 2 and is defined by (60). In this case, $H(X|Y) = \log l$. We now present the performance range for large alphabets, where in all cases we take $l = K/4$. The results are shown in Fig. 5. Again, we see that our bounds are higher than $D(R)$. As the alphabet size increases, the gap between our bound and $D(R)$ also increases.

In all examples, one can easily notice that the lower convex envelope of our lower bounds is very close to the straight line $D(R)$. As was shown in Section 2, this convex envelope is
a lower bound on the distortion of time-sharing fixed-rate codes.

Figure 4: The performance range for the channel (66) and different sizes of alphabets: (a) $K = 64$, $\sigma = 0.5$; (b) $K = 128$, $\sigma = 0.5$; (c) $K = 256$, $\sigma = 0.5$. Horizontal axis: $R$ [Bits]. Straight line - $D(R)$. Lower curve - our lower bound. Higher curve - the distortion of the code (63).

Figure 5: The performance range for the channel (60) and different sizes of alphabets and values of $l$: (a) $K = 64$, $l = 16$; (b) $K = 128$, $l = 32$; (c) $K = 256$, $l = 64$. Horizontal axis: $R$ [Bits]. Straight line - $D(R)$. Lower curve - our lower bound. Higher curve - the distortion of the code (63).



4 Summary, Conclusions and Future Directions 

In this paper, we presented the relevant random variables of the WZ problem as a Markov chain. As far as we know, this is a new formulation in the context of the WZ setting, which enabled us to use the DPT and its generalizations. We then focused on the generalized DPT presented in [15]. We found the generalized RD function of uniform sources, for any convex function $Q$ and a broad class of distortion measures. We also found a useful upper bound for the generalized capacity in the setting above and calculated this capacity exactly for several interesting channels. We then showed that replacing the logarithmic function with other functions, in the DPT we formulated, yields better bounds on the distortion of scalar coding in the WZ setting. Examples of non-trivial lower bounds for scalar coding in this setting were given. For large alphabets, we demonstrated that non-trivial bounds can be achieved for various channels and, as a result, the performance range for scalar coding can be given. We also saw that simple time-sharing between the two endpoints of the WZ RD curve may perform better than any fixed-rate code in various cases. As far as we know, these bounds are the only existing non-trivial bounds for this situation. Clearly, these results are relevant from the practical point of view.

Analytic expressions for the generalized RD function of general sources do not appear to be available. Improving the bounds above by using, for example, the techniques of [21], is yet to be explored. As a next step, a possible direction is to extend our setting to more general scenarios. The first interesting scenario is the variable-rate coding case, where $Z$ is encoded by a variable-length code. In this setting, the generalized capacity (22) will be maximized under the constraint $E\{L(Z)\} \le R$, where $L(Z)$ is the length of the codeword $Z$. The best variable-rate code that can be used is the Huffman code for $p(z)$, provided that $p(x,y) > 0$ for all $x, y \in \mathcal{X}$. The challenge is to perform this optimization. Another interesting scenario is when the output of the encoder is transmitted over a noisy channel (instead of the noiseless one in the WZ setting). In this case, the generalized capacity will be of the form $I^Q(X, Z; Z', Y)$, where $Z'$ is the output of the noisy channel that follows the encoder. Again, the challenge will be to optimize this capacity. Another direction is to extend the mechanism above to settings of coding with memory. In these scenarios, the states of the encoder and the decoder will be allowed to depend on past inputs. Using generalized DPTs, we hope to get useful bounds for this case.

Appendix A - Proof of Lemma 1



Proof. We defined:

$$G_y(p_z) = \sum_{x=1}^{K} p(x)\, p(y|x)\, p(z|x)\, Q\!\left(\frac{p_z \cdot p_y}{p(y|x)\, p(z|x)}\right). \tag{A.1}$$

In order to prove the convexity of $G_y(p_z)$, we use the following known result (cf., e.g., [23]): $G_y(p_z)$ is convex if and only if $\mathrm{dom}(G_y)$ is convex, and the function $g : \mathbb{R} \to \mathbb{R}$, defined by:

$$g(t) = G_y(r + vt), \qquad \mathrm{dom}(g) = \{t : r + vt \in \mathrm{dom}(G_y)\}, \tag{A.2}$$

is convex in $t$, for any $r \in \mathrm{dom}(G_y)$, $v \in \mathbb{R}^K$. Substituting $p_z = r + vt$, we get:

$$g(t) = \sum_{x=1}^{K} p(x)\, p(y|x)(r_x + v_x t)\, Q\!\left(\frac{a_y + b_y t}{p(y|x)(r_x + v_x t)}\right), \tag{A.3}$$

where $\{r_x\}_{x=1}^{K}$ and $\{v_x\}_{x=1}^{K}$ are the elements of $r$ and $v$, respectively, and $a_y = r \cdot p_y$, $b_y = v \cdot p_y$ are constants. A linear combination of convex functions with non-negative coefficients is convex. Thus, it is enough to show the convexity of the following functions, $x \in \{1, \ldots, K\}$:

$$f_x(t) = p(y|x)(r_x + v_x t)\, Q\!\left(\frac{a_y + b_y t}{p(y|x)(r_x + v_x t)}\right) = h\big(a_y + b_y t,\ p(y|x)(r_x + v_x t)\big), \tag{A.4}$$

where $h(u,s) = sQ(u/s)$ is the perspective (cf., e.g., [23]) of the convex function $Q(u)$, and thus convex. The function $f_x(t)$ is the restriction of the convex function $h(u,s)$ to the straight line $\{u = a_y + b_y t,\ s = p(y|x)(r_x + v_x t)\}$. Therefore, $f_x(t)$ is convex. □
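The key step of this proof, that the perspective $h(u,s) = sQ(u/s)$ of a convex $Q$ remains convex when restricted to a straight line, can be probed numerically. The sketch below is only an illustration under stated assumptions: the constants playing the roles of $a_y$, $b_y$, $p(y|x)$, $r_x$ and $v_x$ are drawn at random from hypothetical ranges that keep both arguments positive, and convexity is tested through the midpoint inequality rather than proved.

```python
import math
import random

def perspective(Q, u, s):
    """h(u, s) = s * Q(u / s), defined for s > 0."""
    return s * Q(u / s)

def f_x(Q, t, a_y, b_y, p_yx, r_x, v_x):
    """The perspective restricted to the line u = a_y + b_y t, s = p_yx (r_x + v_x t)."""
    return perspective(Q, a_y + b_y * t, p_yx * (r_x + v_x * t))

def midpoint_convex(f, t1, t2, tol=1e-9):
    """True if f((t1 + t2) / 2) <= (f(t1) + f(t2)) / 2 up to a small tolerance."""
    return f(0.5 * (t1 + t2)) <= 0.5 * (f(t1) + f(t2)) + tol

def check_restriction(Q, trials=1000, seed=0):
    """Midpoint test on random lines; t stays in [0, 1] so that u > 0 and s > 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Hypothetical constants, chosen only to keep both arguments positive.
        a_y, b_y = rng.uniform(0.1, 1.0), rng.uniform(-0.05, 0.5)
        p_yx = rng.uniform(0.05, 1.0)
        r_x, v_x = rng.uniform(0.1, 1.0), rng.uniform(-0.05, 0.5)
        t1, t2 = rng.random(), rng.random()
        f = lambda t: f_x(Q, t, a_y, b_y, p_yx, r_x, v_x)
        if not midpoint_convex(f, t1, t2):
            return False
    return True

if __name__ == "__main__":
    print(check_restriction(lambda t: -math.log(t)))
    print(check_restriction(lambda t: -math.sqrt(t)))
```

A passing midpoint test does not replace the argument above; it only gives a quick sanity check for a particular choice of $Q$.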



Appendix B - Proof of Eq. (39) 



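Proof. It is enough to calculate the limit:

$$\lim_{\alpha \to 1} \frac{1}{\alpha - 1} \log\left(\sum_y \left(\sum_x p(x)\, p(y|x)^{\frac{1}{2-\alpha}}\right)^{2-\alpha}\right) = \lim_{\alpha \to 1} \frac{d}{d\alpha} \log\left(\sum_y \left(\sum_x p(x)\, p(y|x)^{\frac{1}{2-\alpha}}\right)^{2-\alpha}\right), \tag{B.1}$$

where the equality is due to L'Hopital's rule, using the fact that both the numerator and denominator tend to 0 as $\alpha \to 1$. Noticing that the expression in the brackets has the form of the Gallager function (see, e.g., [24, Chap. 5]) for the given DMC, defined as

$$E_0(\rho, p) = -\log\left(\sum_y \left(\sum_x p(x)\, p(y|x)^{\frac{1}{1+\rho}}\right)^{1+\rho}\right), \tag{B.2}$$

we can use the following property of $E_0(\rho, p)$:

$$\left.\frac{\partial E_0(\rho, p)}{\partial \rho}\right|_{\rho=0} = I(X;Y), \tag{B.3}$$

to get, with $\rho = 1 - \alpha$:

$$\left.\frac{d}{d\alpha} \log\left(\sum_y \left(\sum_x p(x)\, p(y|x)^{\frac{1}{2-\alpha}}\right)^{2-\alpha}\right)\right|_{\alpha=1} = I(X;Y), \tag{B.4}$$

which completes the proof. □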
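The limit in this proof can be checked numerically for a small channel. The following is a minimal sketch under the assumption that the quantity in question is $\frac{1}{\alpha-1}\log\sum_y\big(\sum_x p(x)p(y|x)^{1/(2-\alpha)}\big)^{2-\alpha}$, i.e. the Gallager function evaluated at $\rho = 1-\alpha$; the function names and the example channel are hypothetical, and natural logarithms are used throughout.

```python
import math

def log_gallager_sum(alpha, p_x, P):
    """log sum_y ( sum_x p(x) p(y|x)^(1/(2-alpha)) )^(2-alpha), natural log."""
    s = 2.0 - alpha
    total = 0.0
    for y in range(len(P[0])):
        inner = sum(p_x[x] * P[x][y] ** (1.0 / s) for x in range(len(P)))
        total += inner ** s
    return math.log(total)

def scaled_log(alpha, p_x, P):
    """The ratio whose limit as alpha -> 1 is taken; alpha must differ from 1."""
    return log_gallager_sum(alpha, p_x, P) / (alpha - 1.0)

def mutual_information(p_x, P):
    """I(X;Y) in nats for input distribution p_x and channel rows P[x][y]."""
    n_in, n_out = len(P), len(P[0])
    p_y = [sum(p_x[x] * P[x][y] for x in range(n_in)) for y in range(n_out)]
    mi = 0.0
    for x in range(n_in):
        for y in range(n_out):
            if p_x[x] > 0.0 and P[x][y] > 0.0:
                mi += p_x[x] * P[x][y] * math.log(P[x][y] / p_y[y])
    return mi

if __name__ == "__main__":
    p_x = [0.5, 0.3, 0.2]
    P = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.25, 0.25, 0.5]]
    for alpha in (0.9, 0.99, 0.999, 1.001, 1.01, 1.1):
        print(alpha, scaled_log(alpha, p_x, P), mutual_information(p_x, P))
```

As $\alpha$ approaches 1 from either side, the printed ratio should approach the mutual information of the example channel.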



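Appendix C - Proof of Lemma 3

The idea of the proof is to exploit the symmetry of the distortion matrix. From symmetry, we expect that each input symbol will have the same set of transition probabilities.

Proof. We define $p(i|j) = P_{\hat{X}|X}(i|j)$. By definition:

$$I^Q(X;\hat{X}) = \sum_{x,\hat{x}} p(x,\hat{x})\, Q\!\left(\frac{p(\hat{x})}{p(\hat{x}|x)}\right) = \frac{1}{K}\sum_{i=1}^{K} p(i|i)\, Q\!\left(\frac{p(\hat{x}=i)}{p(i|i)}\right) + \frac{1}{K}\sum_{i=1}^{K}\sum_{j \neq i} p(j|i)\, Q\!\left(\frac{p(\hat{x}=j)}{p(j|i)}\right). \tag{C.1}$$

Each row of the distortion matrix contains the same $K$ values. We enumerate these values as $\{\rho_1, \rho_2, \ldots, \rho_K\}$ where $\rho_1 = \rho(x,x) = 0$. Without loss of generality and only for convenience of the proof, we assume that we have $K$ different values. We define:

$$\hat{x}_k(x) = \{\hat{x} \in \mathcal{X} : \rho(x,\hat{x}) = \rho_k\}. \tag{C.2}$$

In words, $\hat{x}_k(x)$ is the unique alphabet symbol with distortion $\rho_k$ relative to $x$. Using this definition, we can write (C.1) in the following way:

$$I^Q(X;\hat{X}) = \frac{1}{K}\sum_{i=1}^{K} p(i|i)\, Q\!\left(\frac{p(\hat{x}=i)}{p(i|i)}\right) + \frac{1}{K}\sum_{i=1}^{K}\sum_{k=2}^{K} p(\hat{x}_k(i)|i)\, Q\!\left(\frac{p(\hat{x}=\hat{x}_k(i))}{p(\hat{x}_k(i)|i)}\right). \tag{C.3}$$

We define:

$$P_k = \frac{1}{K}\sum_{i=1}^{K} p(\hat{x}_k(i)|i), \quad k \in \{2, \ldots, K\}, \qquad P_1 = \frac{1}{K}\sum_{i=1}^{K} p(i|i) = 1 - \frac{1}{K}\sum_{i=1}^{K}\sum_{k=2}^{K} p(\hat{x}_k(i)|i). \tag{C.4}$$

Using these definitions, the distortion is given by:

$$D = \frac{1}{K}\sum_{k=2}^{K}\sum_{i=1}^{K} p(\hat{x}_k(i)|i)\, \rho_k = \sum_{k=2}^{K} P_k \rho_k. \tag{C.5}$$

Applying the Jensen inequality, we get:

$$
\begin{aligned}
I^Q(X;\hat{X}) &= P_1 \sum_{i=1}^{K} \frac{p(i|i)}{K P_1}\, Q\!\left(\frac{p(\hat{x}=i)}{p(i|i)}\right) + \sum_{k=2}^{K} P_k \sum_{i=1}^{K} \frac{p(\hat{x}_k(i)|i)}{K P_k}\, Q\!\left(\frac{p(\hat{x}=\hat{x}_k(i))}{p(\hat{x}_k(i)|i)}\right) \\
&\ge P_1\, Q\!\left(\sum_{i=1}^{K} \frac{p(i|i)\, p(\hat{x}=i)}{K P_1\, p(i|i)}\right) + \sum_{k=2}^{K} P_k\, Q\!\left(\sum_{i=1}^{K} \frac{p(\hat{x}_k(i)|i)\, p(\hat{x}=\hat{x}_k(i))}{K P_k\, p(\hat{x}_k(i)|i)}\right) \\
&= P_1\, Q\!\left(\frac{1}{K P_1}\right) + \sum_{k=2}^{K} P_k\, Q\!\left(\frac{1}{K P_k}\right).
\end{aligned} \tag{C.6}
$$

Notice that the sum $\sum_{i=1}^{K} p(\hat{x} = \hat{x}_k(i))$ is running over all values of $\hat{x}$ due to the symmetry of the distortion matrix ($\rho_k$ appears in each column) and is thus equal to 1. The lower bound in (C.6) is achieved by a channel of the form:

$$p(\hat{x}_k(i)|i) = P_k, \quad k \in \{2, \ldots, K\}, \quad i \in \{1, 2, \ldots, K\}. \tag{C.7}$$

This channel achieves, of course, the same distortion $D$, which depends only on $\{P_k\}$. We also have for this channel:

$$p(\hat{x} = i) = \frac{1}{K}\, p(i|i) + \frac{1}{K}\sum_{k=2}^{K} P_k = \frac{1}{K}. \tag{C.8}$$

Substituting (C.7) and (C.8) in (C.3), we get:

$$I^Q(X;\hat{X}) = \frac{1}{K}\sum_{i=1}^{K} p(i|i)\, Q\!\left(\frac{1/K}{p(i|i)}\right) + \frac{1}{K}\sum_{i=1}^{K}\sum_{k=2}^{K} p(\hat{x}_k(i)|i)\, Q\!\left(\frac{1/K}{p(\hat{x}_k(i)|i)}\right) = \sum_{k=1}^{K} P_k\, Q\!\left(\frac{1}{K P_k}\right), \tag{C.9}$$

which is exactly the lower bound. In summary, we showed that the channel that minimizes $I^Q(X;\hat{X})$ among all channels with the same $\{P_k\}_{k=1}^{K}$ is of the form (C.7). Therefore, to get the RD function, it is enough to optimize (C.9) over all probability measures $\{P_k\}_{k=1}^{K}$ subject to the constraint (C.5). The generalized mutual information $I^Q(X;\hat{X})$ is, of course, convex in $\{P_k\}_{k=1}^{K}$. Thus, this is a standard convex minimization problem under linear constraints, and the solution is given by the Karush-Kuhn-Tucker conditions, which are exactly (46) and (47). □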



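Appendix D - Proof of Lemma 4

Proof. Again, the idea of the proof is to exploit the symmetry of the source and the Hamming distortion. We assume that the source $X$ is uniformly distributed over $\mathcal{X}$, and use the Hamming distortion measure. Under these conditions, the distortion $D$ is given by:

$$D = \sum_{i=1}^{K}\sum_{j=1}^{K} p(i)\, p(j|i)\, \rho(i,j) = \frac{1}{K}\sum_{i=1}^{K}\sum_{j \neq i} p(j|i), \tag{D.1}$$

where we have defined $p(i) = P_X(i)$, $p(j|i) = P_{\hat{X}|X}(j|i)$, and:

$$D_i = \sum_{j \neq i} p(i|j). \tag{D.2}$$

Thus we have the following:

$$p(\hat{x} = j) = \frac{1}{K}\sum_{i} p(j|i) = \frac{1}{K}\sum_{i \neq j} p(j|i) + \frac{1}{K}\, p(j|j) = \frac{1}{K}\left[D_j + p(j|j)\right] \tag{D.3}$$

and

$$1 - D = \frac{1}{K}\sum_{i} p(i|i). \tag{D.4}$$

Calculating $I^Q(X;\hat{X})$:

$$
\begin{aligned}
I^Q(X;\hat{X}) &= \frac{1}{K}\sum_{i} p(i|i)\, Q\!\left(\frac{p(\hat{x}=i)}{p(i|i)}\right) + \frac{1}{K}\sum_{i}\sum_{j \neq i} p(j|i)\, Q\!\left(\frac{p(\hat{x}=j)}{p(j|i)}\right) \\
&= \frac{1}{K}\sum_{i} p(i|i)\, Q\!\left(\frac{D_i + p(i|i)}{K\, p(i|i)}\right) + \frac{1}{K}\sum_{i}\sum_{j \neq i} p(j|i)\, Q\!\left(\frac{D_j + p(j|j)}{K\, p(j|i)}\right) \\
&\ge (1-D)\, Q\!\left(\frac{1}{(1-D)K^2}\sum_{i}\big(D_i + p(i|i)\big)\right) + D\, Q\!\left(\frac{1}{D K^2}\sum_{i}\sum_{j \neq i}\big(D_j + p(j|j)\big)\right) \\
&= (1-D)\, Q\!\left(\frac{D + 1 - D}{K(1-D)}\right) + D\, Q\!\left(\frac{K-1}{KD}\,(D + 1 - D)\right) \\
&= (1-D)\, Q\!\left(\frac{1}{K(1-D)}\right) + D\, Q\!\left(\frac{K-1}{KD}\right).
\end{aligned} \tag{D.5}
$$

The second equality is due to (D.3). The inequality is obtained by applying the Jensen inequality to each one of the two weighted sums, after normalizing the weights using (D.4). The next equalities are obtained by calculating the sums that appear as arguments of the function $Q$ using (D.4) and by simple algebraic manipulations. It is easy to show, by simple substitution in $I^Q(X;\hat{X})$, that this lower bound is achieved, for any convex function $Q(t)$, by the following symmetric channel:

$$p(\hat{x}|x) = \begin{cases} 1 - D, & \hat{x} = x \\ \dfrac{D}{K-1}, & \hat{x} \neq x. \end{cases} \tag{D.6}$$

□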
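The statement proved in this appendix can be sanity-checked numerically. The sketch below assumes the generalized mutual information has the form $I^Q(X;\hat{X}) = \sum p(x,\hat{x})\,Q\big(p(\hat{x})/p(\hat{x}|x)\big)$ with a uniform source, and that the lower bound reads $(1-D)\,Q\big(\tfrac{1}{K(1-D)}\big) + D\,Q\big(\tfrac{K-1}{KD}\big)$; both are readings of the derivation above rather than quotations, and all function names are hypothetical. It compares random test channels against the bound and checks that the symmetric channel with diagonal $1-D$ attains it.

```python
import math
import random

def generalized_mi(P, Q):
    """I^Q(X; Xhat) for a uniform source; P[i][j] = p(xhat = j | x = i) > 0."""
    K = len(P)
    p_hat = [sum(P[i][j] for i in range(K)) / K for j in range(K)]
    return sum(P[i][j] / K * Q(p_hat[j] / P[i][j])
               for i in range(K) for j in range(K))

def hamming_distortion(P):
    """Average probability that xhat differs from x under a uniform source."""
    K = len(P)
    return 1.0 - sum(P[i][i] for i in range(K)) / K

def hamming_bound(D, K, Q):
    """(1 - D) Q(1 / (K (1 - D))) + D Q((K - 1) / (K D)), for 0 < D < 1."""
    return (1.0 - D) * Q(1.0 / (K * (1.0 - D))) + D * Q((K - 1.0) / (K * D))

def symmetric_channel(D, K):
    """Diagonal 1 - D, off-diagonal D / (K - 1)."""
    return [[1.0 - D if i == j else D / (K - 1) for j in range(K)]
            for i in range(K)]

def random_channel(K, rng):
    """A random test channel with strictly positive transition probabilities."""
    rows = []
    for _ in range(K):
        w = [rng.random() + 1e-3 for _ in range(K)]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

if __name__ == "__main__":
    Q = lambda t: -math.log(t)  # this choice gives the ordinary mutual information
    rng = random.Random(0)
    P = random_channel(5, rng)
    D = hamming_distortion(P)
    print(generalized_mi(P, Q), hamming_bound(D, 5, Q))
    print(generalized_mi(symmetric_channel(D, 5), Q))
```

Any convex $Q$ can be substituted for the logarithmic choice used in the usage lines.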



Appendix E - Proof of Eq. (57) 



Proof. We assume that the source is uniformly distributed and that the DMC is given by (56). The Hamming distortion is equal to the average probability of error. Therefore, given a scalar encoder of fixed rate $R = \log M$, the optimal decoding strategy is, of course, maximum likelihood:

$$\hat{x} = g(z,y) = \arg\max_{x \in A_z} \{p(y|x)\}. \tag{E.1}$$

For the channel (56), the decoder gets the form:

$$\hat{x} = \begin{cases} y, & y \in A_z \\ \text{choose } x' \in A_z \text{ uniformly at random}, & y \notin A_z. \end{cases} \tag{E.2}$$

Given $x \in A_z$, we have two error events. The first error event is when $y \in A_z$ and $y \neq x$. The probability of such an event is $(M_z - 1)\epsilon$. The other error event is when $y \notin A_z$ and $x' \neq x$. The probability of this event is the product $\Pr\{y \notin A_z\} \cdot \Pr\{x' \neq x \,|\, x\} = (K - M_z)\epsilon \cdot (M_z - 1)/M_z$. Thus, the distortion is given by:

$$
\begin{aligned}
d &= \frac{1}{K}\sum_{x} \Pr\{\text{error}\,|\,x\} \\
&= \frac{1}{K}\sum_{z}\sum_{x \in A_z} \left[(M_z - 1)\epsilon + (K - M_z)\epsilon (M_z - 1)/M_z\right] \\
&= \frac{\epsilon}{K}\sum_{z} M_z\left[(M_z - 1) + (K - M_z)(M_z - 1)/M_z\right] \\
&= \frac{\epsilon}{K}\sum_{z} \left[M_z (M_z - 1) + (K - M_z)(M_z - 1)\right] \\
&= \frac{\epsilon}{K}\sum_{z} (M_z - 1)\left[M_z + K - M_z\right] \\
&= \epsilon \sum_{z} (M_z - 1) \\
&= \epsilon (K - M) \\
&= \epsilon (K - 2^R).
\end{aligned} \tag{E.3}
$$

Notice that $d(R)$ is a decreasing concave function. Thus, by time-sharing an encoder with $R = \log K$ and an encoder with $R = 0$, we can achieve the straight line $d(R) = \epsilon(K - 1)(1 - R/\log K)$ and outperform any fixed-rate encoder. □
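The closed form $d = \epsilon(K - M)$ can be cross-checked by brute force. The sketch below assumes that the channel (56) has $p(y|x) = \mu$ for $y = x$ and $p(y|x) = \epsilon$ otherwise, with $\mu = 1 - (K-1)\epsilon > \epsilon$, which is how the channel is described in Appendix F; the function names and the example partition are hypothetical. The maximum-likelihood distortion is computed directly from the channel matrix for an arbitrary partition $\{A_z\}$ and compared with the formula and with the time-sharing line.

```python
import math

def two_level_channel(K, eps):
    """p(y|x) = mu on the diagonal and eps elsewhere, with mu = 1 - (K - 1) eps."""
    mu = 1.0 - (K - 1) * eps
    return [[mu if y == x else eps for y in range(K)] for x in range(K)]

def ml_distortion(P, partition):
    """Error probability of ML decoding within A_z, for a uniform source."""
    K = len(P)
    correct = 0.0
    for A_z in partition:
        for y in range(K):
            # Ties among equally likely symbols do not change the success probability.
            correct += max(P[x][y] for x in A_z) / K
    return 1.0 - correct

def closed_form(K, M, eps):
    """d = eps (K - M), the last lines of (E.3) with M = 2^R."""
    return eps * (K - M)

def time_sharing(R_bits, K, eps):
    """Straight line between the rate-0 code and the rate-log K code."""
    return eps * (K - 1) * (1.0 - R_bits / math.log2(K))

if __name__ == "__main__":
    K, eps = 12, 0.02
    P = two_level_channel(K, eps)
    partition = [[0, 1, 2, 3, 4], [5], [6, 7], [8, 9, 10, 11]]  # M = 4, uneven subsets
    print(ml_distortion(P, partition), closed_form(K, len(partition), eps))
    print(time_sharing(math.log2(len(partition)), K, eps))
```

The uneven partition in the usage lines illustrates that the result depends only on the number of subsets $M$, not on their sizes.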



Appendix F - The concavity of $q_\alpha(x)$ defined in Eq. (58)



We start with explaining the following equality for the channel (56): 



vv,/-;f 8 = vm, '"^^-Viif-M.iM,- 



z y 



= Yq a (M z ). (F.l) 

z 

The sum over $y$ is calculated by noticing that the product $[p_z \cdot p_y]$, and, respectively, the product $[p_z \cdot p_y^\alpha]$, can have one of two results. If the 1's in the binary vector $p_z$ overlap only $\epsilon$'s in $p_y$, we get $[p_z \cdot p_y] = M_z \cdot \epsilon$ and, respectively, $[p_z \cdot p_y^\alpha] = M_z \cdot \epsilon^\alpha$. Otherwise, we get $[p_z \cdot p_y] = \mu + (M_z - 1) \cdot \epsilon$ and, respectively, $[p_z \cdot p_y^\alpha] = \mu^\alpha + (M_z - 1) \cdot \epsilon^\alpha$. It is not hard to see that the second result will occur exactly $M_z$ times in the sum on $y$, and thus the first will occur exactly $K - M_z$ times. The rest is straightforward. We now prove the concavity of the function $q_\alpha(x)$, $1 \le x \le K - M + 1$. The second derivative of $q_\alpha(x)$ is ($c_\alpha = \mu^\alpha/\epsilon^\alpha$):



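$$
\begin{aligned}
q''_\alpha(x) = {} & 2(2-\alpha)(x + \mu/\epsilon - 1)^{1-\alpha} - x(2-\alpha)(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha} \\
& - 2c_\alpha(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha} + x \cdot c_\alpha \cdot \alpha(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha-1} \\
& - 2(2-\alpha)x^{1-\alpha} - (2-\alpha)(K-x)(\alpha-1)x^{-\alpha} \\
< {} & 2(2-\alpha)x^{1-\alpha} - x(2-\alpha)(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha} \\
& - 2c_\alpha(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha} + x \cdot c_\alpha \cdot \alpha(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha-1} \\
& - 2(2-\alpha)x^{1-\alpha}.
\end{aligned} \tag{F.2}
$$

The inequality follows from the assumption $\mu > \epsilon$, which leads to $x < x + \mu/\epsilon - 1$, and from the fact that the last term in the derivative is negative. Doing some algebraic manipulations, we get:

$$
\begin{aligned}
& -(2-\alpha)(\alpha-1)x(x + \mu/\epsilon - 1)^{-\alpha} - 2c_\alpha(\alpha-1)(x + \mu/\epsilon - 1)^{-\alpha} + c_\alpha \cdot \alpha(\alpha-1)x(x + \mu/\epsilon - 1)^{-\alpha-1} \\
& \quad \propto -(2-\alpha)x(x + \mu/\epsilon - 1) - 2c_\alpha(x + \mu/\epsilon - 1) + x \cdot c_\alpha \cdot \alpha \\
& \quad = -\left[(2-\alpha)x + 2c_\alpha\right](x + \mu/\epsilon - 1) + x \cdot c_\alpha \cdot \alpha \\
& \quad < -\left[(2-\alpha)x + 2c_\alpha\right]x + x \cdot c_\alpha \cdot \alpha \\
& \quad \propto -(2-\alpha)x - 2c_\alpha + c_\alpha \cdot \alpha \\
& \quad = -(2-\alpha)x - c_\alpha(2-\alpha) \\
& \quad < 0.
\end{aligned} \tag{F.3}
$$

Notice that the constant of proportionality is positive in all cases. We showed that $q_\alpha(x)$ is concave for $1 < \alpha < 2$. Checking the concavity/convexity out of this range can be done numerically, by simple calculation of $q''_\alpha(x)$.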
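The numerical check mentioned at the end of this appendix can be sketched as follows. The code evaluates the six-term expression for $q''_\alpha(x)$ as read from (F.2), with $c_\alpha = \mu^\alpha/\epsilon^\alpha$; since $q_\alpha(x)$ itself is defined in Eq. (58) and is not reproduced here, the expression for the second derivative is taken as an assumption, and the function names and parameter values are hypothetical.

```python
def q_second_derivative(x, alpha, K, mu, eps):
    """Second derivative of q_alpha(x), term by term as read from (F.2)."""
    u = x + mu / eps - 1.0      # shifted argument x + mu/eps - 1
    c = (mu / eps) ** alpha     # c_alpha = mu^alpha / eps^alpha
    return (2.0 * (2.0 - alpha) * u ** (1.0 - alpha)
            - x * (2.0 - alpha) * (alpha - 1.0) * u ** (-alpha)
            - 2.0 * c * (alpha - 1.0) * u ** (-alpha)
            + x * c * alpha * (alpha - 1.0) * u ** (-alpha - 1.0)
            - 2.0 * (2.0 - alpha) * x ** (1.0 - alpha)
            - (2.0 - alpha) * (K - x) * (alpha - 1.0) * x ** (-alpha))

def is_concave_on_grid(alpha, K, eps):
    """True if the second derivative is negative for every integer x in [1, K]."""
    mu = 1.0 - (K - 1) * eps    # diagonal probability, as in Appendix E
    return all(q_second_derivative(float(x), alpha, K, mu, eps) < 0.0
               for x in range(1, K + 1))

if __name__ == "__main__":
    K, eps = 16, 0.01
    for alpha in (1.05, 1.25, 1.5, 1.75, 1.95):
        print(alpha, is_concave_on_grid(alpha, K, eps))
```

Values of $\alpha$ outside $(1, 2)$ can be passed to the same function to explore the range not covered by the proof.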